We'll use some operators from Chapter 3. If you want to write images out or to see them as pictures, you'll need to normalise them:
From Chapter 4, we'll start using edge magnitude now, so we need the Sobel operator, and edge magnitude and direction
Feature Extraction & Image Processing,
M. S. Nixon and A. S. Aguado
Elsevier/ Butterworth Heinmann, 2nd Edition 2007
This worksheet is the companion to Chapter 7 and implements the basic feature description techniques described therein. The worksheet follows the relevant sections of the text directly.
This chapter concerns shape description which is the process by which we generate a set of numbers which describes a shape. We could use the set of points in a shape's border, but this set is usually too large. Also, it will change with position (translation), rotation and scale. We want our description to be invariant to these factors, so that we describe the shape itself.
We'll start with regions and borders, like the text. We'll first need a shape to analyse, let's use the one like
We'll first need routines to check whether a point is connected to any of it's 4 neighbours. We shall here assume that shape points are white (255) and the background is black (0). So for 4-way connectivity, we simply need to check if any of a (white) point's 4 immediate neighbours are also white.
So the function returns 1 when there is no white neighbour, and 0 when there is (this makes the software easier later).
We'll also need to check 8-way connectivity, to assess whether any of a (white) point's immediate neighbours are also white
Note that the point with co-ordinates x=2, y=2 has one non-white neighbour whereas the point with co-ordinates x=2, y=3 is the only white point surrounded entirely by white points.
In order to start the process of finding a shape's boundary, we need to start somewhere. We'll use a routine which scans the image to find the first white point. It actually scans the whole image, keeping the co-ordinates of the first point it found. Clearly, this can be speeded up by breaking the loop(s) when a point is found.
We can now assess connectivity and find the first point. Having found a point, we need to find if there's a connected point in any of the allowed directions. We will combine 8-way connectivity with 4-way neighbourhood analysis, and 4-way boundary connectivity with 8-way region analysis. We'll do the latter of these first. Essentially, at each point, we look for a new point along the four compass directions. If no point is found then the routine returns co-ordinates for a new point which are the same as originally supplied, allowing the calling software to realise that no new point was found. The order of the search is so as to enforce clockwise analysis (by searching the anticlockwise points first, giving precedence to the clockwise points); the last point searched is towards East. The point at the original co-ordinates is set to a grey level in order to mark the completed boundary.
We can see that the grey point has been marked as the first
point and its connecting neighbour is to its right
So we now have the components necessary to find the points in a boundary. We find a first point, and then keep finding points until we can find no more:
We can also do this by combining 8-way connectivity with 4-way neighbourhood analysis. We have to redefine the earlier four way connectivity to check eight neighbours
So we can now find the border two different ways: 4-way connectivity gives a blocky effect whereas 8-way connectivity makes the shape to appear more discontinuous. Note that these are just the border which are points on the boundary. We'll now consider boundary descriptions, ways of describing the boundary we have just found. Let's generate a chain code by 4-way connectivity. Each bit of the code is assigned according to the connection between successive points in shape's boundary. In 4-way connectivity, the codes for particular directions are: North=0, South=1, East=2, and West 3. The chain is simply the series of successive code allocations. We'll reuse the earlier code and add in the chain code bits
It is naturally invariant to translation, so we obtain the same code wherever the shape is placed with an image.
We won't try start point invariance here, as that requires a shift operation (which must be fun in Mathcad!). We won't do rotation or scale invariance either, as that is beyond the scope of this software.
Now we'll look at Fourier descriptors which give a complete description of an object's boundary with the inherent advantages of a Fourier description (access to frequency components etc.).
First, we'll need an initial contour. Again We'll use the function ell_points to generate a set of no points
arranged in an ellipse, centre xc,yc with major and minor axes a and b. We'll use the snake format, including elasticity, curvature and edge-weighting, which we won't use for now, but we'll use it later.
Now we need to extract a set of complex numbers where the x and y co-ordinates are stored as x+jjy. To give the set, we'll use the function extract
So the first point is along the x axis, to the right of the centre, the points continue clockwise, so the second point is just beneath the first.
Here, the first component is the centre of the ellipse, expressed in complex form. The second component is the radius of a circle which best fits the data. The third component is zero, but remaining ones are not. In graphical form, this is
This gives a set of points in complex format. So we'll need to extract them as a contour so that we can use the same drawing algorithm, using the function ext_points
Now for a circle, the first descriptors are again the centre co-ordinates, the second is the radius (for the ellipse, the second co-ordinate is the average radius).
Now these descriptors have no invariance properties: if the shape is larger, or is rotated, then the description changes. Let's define a function which gives the descriptors as
This is actually three descriptions, the angular change, the cumulative and gular change, and the normalised angular function. Only the last of these is useable in shape description. Let's see how they go for a circle:
Let's see the difference between these descriptors and those of a real shape. We'll use the Greedy algorithm to find a shape since it can provide a nice set of points describing a border. We'll then use Fourier descriptors to describe the shape, and compare it with those of another. Let's read in the image of the club suit in playing cards
Hmmm, these look pretty similar to those of a circle, we'd better see if there is actually any difference.
So the centre information is in ax and ay, and is twice the size (due to the nature of the equations)
So ax and by only describe the ellipse. You can see the influence of noise in the higher order descriptors bx and by.
Let's have a look at the descriptors, they are clearly different from those of the ellipse! The descriptors bx certainly have value now.
That completes our study of perimeter descriptions. We'll now move to area based descriptions, those which describe regions.
So far, we have looked at methods for perimeter description. There are region descriptors which describe properties of area. We'll start by looking again at an ellipse, so we need a routine which gives an ellipsoidal shaped area. The function draw_ellip does just this, setting to white (255) any point inside an ellipse's boundary.
We need a function to turn the points into a vector of point co-ordinates. Extract_shape does just that.
The basic area descriptors are common geometrical properties, like area and length of perimeter. The area is simply a count of the points inside a shape and the perimeter length is the number of pixels in it.
The number of points in the perimeter depends on how the perimeter is extracted. If we extract it by four way analysis, the length will be larger than that for eight way analysis.
Naturally, a circle has the most compact shape whereas a line is the least compact. Let's check this
In fact, by theory, the compactness of a circle should be unity, but we are dealing with discretised shapes here. As such, the length of a line is measured as a count of the number of points rather than a pure measure of distance. In fact, that's the main problem as, in discrete images, the length of a line will change with rotation
The irregularity of a shape is defined using the maximum chord length which is the largest distance from the centre to a point on the boundary calculated as
but these will change with rotation. To avoid that, we could take the ratio of the major chord length to one taken perpendicular to it.
Now, moments are like Fourier descriptors and exist at different scales. So we'll define a set of pointers starting at zero and rising to order 20. We do this for the x axis (p) and for the y axis (q). The moments are defined as
When divided by the number of points in the shape and the brightness (i.e. the zero order moment), the first order moments give the x and the y co-ordinates of the centre of the shape
The second order moments are like the moment of inertia in mechanics and they describe a shape's distribution
So our ellipse appears to be distributed more about the y axis (the vertical axis) than it is around the x axis (the horizontal axis). Indeed it is. Let's look at the whole set for the x axis (for which y=0); as they grow in an exponential fashion, we'll plot the log of their value.
So we have something similar to Fourier descriptors: access to a description at differing scales. Unfortunately they will change as soon as we do anything to the shape. It's clear that only the zero order moment is invariant as the first order moments will change even with shift (translation). For this reason, we shall consider centralised moments, which are invariant to translation. We can define these centralised moments m as
In order to assess invariance to rotation and translation, let's define rotated and translated versions of the original ellipse
So they don't change with shift, but they rotate with the shape and neither are invariant to scale. That's not too surprising really, since their definition is only phrased in terms of invariance to change in position and that is what we see.
To get better invariance properties, we need normalised central moments h which for p+q>2 are calculated from the centralised moments as
Clearly, this is invariant with rotation, and with scale (the small difference is doubtless due to discretisation effects and is negligible). The second invariant moment is
Again, for an ellipse, this does not vary with rotation, translation or scale, but varies slightly with scale. (Again this is most likely due to discretisation effects. These could be ameliorated by processing shapes of much larger size but that is impractical here.)
Let's do it for a real shape, that of a plane. We'll use three silhouettes: one normal, one rotated and the other different
Clearly, the set of moments for the B1 bomber is different from the F14 fighter. Also, the rotated and scaled version of the F14 fighter gives very similar values, with small fluctuation due to numerical imprecision. Naturally, these numbers can be used to recognise the shapes. To do that, we need a classifier. That's Chapter 8: enjoy!